Multiple copy 2-state discrimination with individual measurements 
We address the problem of non-orthogonal two-state discrimination when multiple copies of the 
unknown state are available. We give the optimal strategy when only fixed individual measurements 
are allowed and show that its error probability saturates the collective (lower) bound asymptotically. 
We also give the optimal strategy when adaptivity of individual von Neumann measurements is 
allowed (which requires classical communication), and show that the corresponding error probability 
is exactly equal to the collective one for any number of copies. We show that this strategy can be 
regarded as Bayesian updating. 

PACS numbers: 03.67.Hk, 03.65.Ta 



I. INTRODUCTION 

Measurement is a central tenet of quantum mechanics. 
As for any sensible theory of nature, it links abstract 
ideas to reality and makes mathematical concepts truly 
physical. In contrast to classical measurements, which 
(ideally) have no demolishing effect whatsoever, in the 
quantum realm any attempt to acquire information from 
a system alters it to a degree proportional to the gain 
of information. Moreover, this gain is limited [1]. Given 
a single copy of an unknown quantum state it is usually 
impossible to determine it by performing any conceivable 
measurement. Nevertheless, if an increasing number of 
copies of such state is available, our knowledge of the 
state can also increase by the use of various measurement 
strategies, and complete determination can be achieved 
in the asymptotic limit when the number of copies goes 
to infinity. 

Measurement strategies involving multiple copies of a 
quantum state fall into two categories: collective and 
individual (local), depending on whether a single mea- 
surement is performed on all the copies as a whole or 
the strategy consists of individual measurements each of 
them performed separately on a single copy. Since the 
pioneering work of Helstrom [2], and Peres and Woot- 
ters [3], it has been repeatedly shown that collective 
strategies outperform individual ones. This should not 
come as a surprise, since the latter can be viewed as a 
subset of the former, which are completely general and 
unconstrained. Collective measurements, however, are 
difficult to implement experimentally, and a great deal 
of effort has gone into designing optimal strategies involving 
only individual measurements. Common examples are 
quantum tomography [4] and (local) adaptive strate- 
gies [5] (where the choice of each individual measurement 
is based on the outcomes of the previous ones), both 
in the context of quantum state estimation. The state- 
of-the-art of these approaches can only compete with col- 
lective strategies in the asymptotic limit. 

Many practical applications, however, do not require 
a full determination of a state. For instance, to assess 
the security of a key distribution protocol in quantum 
cryptography one gives full advantage to Eve, the 
eavesdropper. Hence, one usually assumes she knows the 
set of possible states that will be used in a secret trans- 
mission (e.g., in the B92 protocol [7] this set consists of 
two non-orthogonal states), and her task is to discrim- 
inate [8] among them. She can follow two different ap- 
proaches: use a strategy based on quantum hypothesis 
testing (inconclusive discrimination), which gives the 
lowest probability of error, or do unambiguous (or con- 
clusive) discrimination [9], namely, adopt a strategy that 
does not tolerate errors. 



When the number of copies is greater than one (as is 
the case of an incompletely attenuated laser pulse, which 
may be split into several identical single-photon states), the 
discussion above concerning individual versus collective 
strategies becomes again an issue. In this paper we fo- 
cus our attention on this situation. To be more concrete, 
we will consider a hypothesis-testing approach to (non- 
orthogonal) two-state discrimination under the assump- 
tion that we have N identical copies of the transmitted 
quantum state. We will find the best adaptive strategy, 
i.e., a particular case of strategies that use local opera- 
tions and classical communication (LOCC for short), and 
we will show that it is optimal regardless of the number of 
copies, in the sense that its error probability and that 
of the optimal collective strategy are exactly the same 
for any N. A similar result was obtained by Brody and 
Meister [10] for Bayesian updating. Our result could be 
seen as its extension to general adaptive strategies. How- 
ever, we will prove the remarkable result that the whole 
class of adaptive strategies actually has a single element: 
Bayesian updating. 

If classical communication is not allowed, we show 
that optimality holds asymptotically for the fixed mea- 
surement strategy named unanimity vote, which we also 
present here. 
II. PRELIMINARIES 

We will start by reviewing some known facts about 
two-state discrimination, including a few technical de- 
tails, which will help us to introduce the notation. 

A. One copy 

By choosing the appropriate orthonormal basis, any 
two states $|\psi_0\rangle$, $|\psi_1\rangle$ (which we will assume to be neither 
orthogonal nor parallel) can always be written as 

$$ |\psi_a\rangle = \cos\theta\,|x\rangle + (-1)^a \sin\theta\,|y\rangle; \quad a = 0, 1; $$    (1)

regardless of the dimension of the Hilbert space $\mathcal{H}$ they belong 
to, where the unit vectors $|x\rangle$, $|y\rangle$ are the elements 
of the basis that span the plane formed by $|\psi_0\rangle$, $|\psi_1\rangle$. 
Now, we ask ourselves what the best measurement for 
discriminating between $|\psi_0\rangle$ and $|\psi_1\rangle$ is. It can be defined 
in terms of two orthonormal vectors, $\{|\omega_1(0)\rangle, |\omega_1(1)\rangle\}$, 
which also belong to this plane, and thus can be written as 

$$ |\omega_1(a)\rangle = \cos\left(\phi_0 - a\frac{\pi}{2}\right)|x\rangle + \sin\left(\phi_0 - a\frac{\pi}{2}\right)|y\rangle. $$    (2)

In our approach, by 'best measurement' we mean 
the measurement that maximizes the probability of 
discrimination, $\bar P_1 = \sum_{a=0}^{1} q_a\, p(\mathbf{a}|a) = \sum_{a=0}^{1} p(\mathbf{a}, a)$ [or 
equivalently the one that minimizes the error probability 
$P_1 = 1 - \bar P_1$]. Here $q_a$ is the prior probability of 
$|\psi_a\rangle$ being (secretly) transmitted, $p(\mathbf{0}|0)$ and $p(\mathbf{1}|1)$ are 
the conditional probabilities of obtaining the outcomes 
0 or 1 given that the unknown state is $|\psi_0\rangle$ or $|\psi_1\rangle$ 
respectively, and $p(\mathbf{0}, 0)$ and $p(\mathbf{1}, 1)$ are the corresponding 
joint probabilities. The subindex 1 in $|\omega_1(a)\rangle$ and in 
the probability of discrimination/error emphasizes that 
so far we are dealing with just one copy of the unknown 
state. Throughout this paper, boldfaced random variables 
will denote the outcomes of our measurement; thus, 
e.g., $p(1|\mathbf{0})$ is the (a posteriori) probability of the 
transmitted state being $|\psi_1\rangle$ given that the outcome of our 
measurement is 0. Using elementary quantum mechanics, 
the conditional probabilities $p(\mathbf{a}|b)$ can be computed 
to be $p(\mathbf{a}|b) = |\langle\omega_1(a)|\psi_b\rangle|^2 = \cos^2[\phi_0 - a\pi/2 - (-1)^b\theta]$. 



The optimal measurement and the corresponding probabilities 
of discrimination and error are given by 

$$ \cos 2\phi_0 = \frac{q_0 - q_1}{R_0}\cos 2\theta, \qquad \sin 2\phi_0 = \frac{1}{R_0}\sin 2\theta, $$    (3)

$$ \bar P_1 = \frac{1}{2}(1 + R_0), \qquad P_1 = \frac{1}{2}(1 - R_0), $$    (4)

where $R_0 = [(q_0 - q_1)^2 + 4 q_0 q_1 \sin^2 2\theta]^{1/2}$. In terms of the 
overlap between $|\psi_0\rangle$ and $|\psi_1\rangle$, defined as $c = |\langle\psi_0|\psi_1\rangle| = \cos 2\theta$, 
the factor $R_0$ can be written as 

$$ R_0 = \sqrt{1 - 4 q_0 q_1 c^2}. $$    (5)

In the simple case where $q_0 = q_1 = 1/2$, we have $\phi_0 = \pi/4$ 
and, thus, $|\omega_1(a)\rangle = \{|x\rangle + (-1)^a |y\rangle\}/\sqrt{2}$, as one would 
expect. 
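
As a numerical illustration of Eqs. (3)-(5), the following minimal Python sketch computes the optimal angle and the discrimination/error probabilities, and compares them with a brute-force scan over the measurement angle. The function names, the sample values q0 = 0.7 and theta = 0.3, and the grid size are illustrative assumptions, not part of the original text; theta is assumed to lie strictly between 0 and pi/4.

```python
import math

def one_copy_optimum(q0, theta):
    """Return (phi0, success, error) from Eqs. (3)-(5)."""
    q1 = 1.0 - q0
    c = math.cos(2 * theta)                   # overlap c = cos(2*theta)
    r0 = math.sqrt(1 - 4 * q0 * q1 * c * c)   # Eq. (5)
    # Eq. (3); atan2 takes (sin, cos) up to the common positive factor 1/R_0.
    phi0 = 0.5 * math.atan2(math.sin(2 * theta), (q0 - q1) * c)
    return phi0, 0.5 * (1 + r0), 0.5 * (1 - r0)   # Eq. (4)

def success_probability(phi, q0, theta):
    """Success probability of the von Neumann measurement with angle phi."""
    q1 = 1.0 - q0
    # p(outcome 0 | state 0) = cos^2(phi - theta),
    # p(outcome 1 | state 1) = sin^2(phi + theta)
    return q0 * math.cos(phi - theta) ** 2 + q1 * math.sin(phi + theta) ** 2

# Illustrative values (assumptions): priors (0.7, 0.3), theta = 0.3 rad.
q0, theta = 0.7, 0.3
phi0, p_success, p_error = one_copy_optimum(q0, theta)
# Brute-force scan of phi over [0, pi/2) as an independent check.
grid_best = max(success_probability(k * math.pi / 2000, q0, theta)
                for k in range(1000))
print(phi0, p_success, p_error, grid_best)
```

The scan should never exceed the closed-form value, and for equal priors the angle returned reduces to pi/4, as stated above.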



B. Several copies. Collective measurements 

Let us next suppose that N copies of either $|\psi_0\rangle$ or $|\psi_1\rangle$ 
are available to us. In full analogy with (1) we define 

$$ |\Psi_a\rangle \equiv |\psi_a\rangle^{\otimes N} = \cos\Theta\,|X\rangle + (-1)^a \sin\Theta\,|Y\rangle, $$    (6)

where $|X\rangle$, $|Y\rangle$ belong to a conveniently chosen basis of 
$\mathcal{H}^{\otimes N}$. In this situation Eqs. (3)-(5) also hold if we 
replace $\theta$ and $c$ with the corresponding uppercased 
variables $\Theta$ and $C = \cos 2\Theta$. In terms of the new basis 
$\{|X\rangle, |Y\rangle, \ldots\}$, the vectors $|\Omega(a)\rangle$ ($a = 0, 1$), which 
define the measurement on the N copies in full analogy 
with $|\omega_1(a)\rangle$, are also given by (2) (uppercasing $\omega$, $x$ and 
$y$). This defines a collective measurement, since in general 
$|\Omega(a)\rangle$ is not a product state. We obviously have 
$C = |\langle\Psi_0|\Psi_1\rangle| = |\langle\psi_0|\psi_1\rangle^N| = c^N$ and thus conclude 
that the error probability for this optimal collective 
measurement is [2] 

$$ P^{col}_N = \frac{1 - \sqrt{1 - 4 q_0 q_1 c^{2N}}}{2}. $$    (7)

Since $c < 1$, in the large N limit we note that 

$$ P^{col}_N \simeq q_0 q_1 c^{2N}. $$    (8)
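
To make the relation between Eqs. (7) and (8) concrete, here is a small Python sketch that evaluates both and prints their ratio for growing N. The function names and the sample values (q0 = 0.7, c = 0.9) are assumptions made for illustration. Eq. (7) is evaluated in the algebraically identical form e / [2(1 + sqrt(1 - e))], with e = 4 q0 q1 c^{2N}, only to avoid floating-point cancellation when the error is tiny.

```python
import math

def collective_error(q0, c, n):
    """Eq. (7), written in a cancellation-free but algebraically identical form."""
    e = 4 * q0 * (1.0 - q0) * c ** (2 * n)
    return e / (2 * (1 + math.sqrt(1 - e)))   # equals (1 - sqrt(1 - e)) / 2

def collective_error_asymptotic(q0, c, n):
    """Leading large-N behaviour, Eq. (8)."""
    return q0 * (1.0 - q0) * c ** (2 * n)

# Illustrative values (assumptions): q0 = 0.7, overlap c = 0.9.
for n in (1, 5, 20, 80):
    exact = collective_error(0.7, 0.9, n)
    print(n, exact, exact / collective_error_asymptotic(0.7, 0.9, n))
```

The printed ratio approaches 1 from above as N grows, which is the content of Eq. (8).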



III. INDIVIDUAL MEASUREMENTS 
A. Fixed measurements 

If we are only allowed to perform the same individual 
measurement on each of our N copies, one could expect 
that the lowest probability of error we can achieve 
is $P^{fxd}_N \simeq \eta\, c^N$, where the constant $\eta$ is not relevant 
for the discussion here. This belief may stem from the 
widespread use of the statistical overlap as a measure of 
distinguishability; from a statistical analysis of the problem 
at hand, one concludes that the probability of error 
is bounded by $\lambda^N$, where $\lambda$ depends on the specific 
individual measurement we are performing. The statistical 
overlap is a particularly convenient choice of $\lambda$ (see 
below). Optimizing over all possible measurements one 
finds that $\lambda = c$ for two pure states. This bound is 
attained by a majority-vote strategy: we perform the best 
individual measurement, given by (2) and (3), on each 
copy and get $N_a$ times the outcome $a$. Once the 
measurement process is complete, we decide in favor of the 
state $|\psi_a\rangle$ whose corresponding $N_a$ is greatest. 

However, there exist tighter bounds for the exponential 
decrease of the probability of error. The best one is 
known as the Chernoff bound, which for the problem at 
hand is given by $\lambda = \min_\alpha \sum_b p(\mathbf{b}|0)^\alpha\, p(\mathbf{b}|1)^{1-\alpha}$, where 
$0 \le \alpha \le 1$ (the statistical overlap is a particular 
simplification of this expression obtained by setting $\alpha = 1/2$). 
We now note that if we assume $q_0 > q_1$, with the choice 
$|\omega(0)\rangle = |\psi_0\rangle$, $|\omega(1)\rangle = |\psi_0^\perp\rangle$ for our measurement we 
have $p(\mathbf{0}|0) = 1$, $p(\mathbf{1}|0) = 0$, $p(\mathbf{0}|1) = c^2$, $p(\mathbf{1}|1) = 1 - c^2$, 
and $\alpha \to 0$ gives the absolute minimum (over all measurements and 
over all values of $\alpha$) of the sum over $b$ above. Thus, 
$P^{fxd}_N \sim c^{2N}$, as for collective measurements. 

There is a simple strategy that saturates the Chernoff 
bound: unanimity vote. Let us perform the measurement 
defined by $\{|\omega(a)\rangle\}$ on each of our N copies. 
If we always obtain the outcome 0 ($N_0 = N$), we claim 
that the unknown state is $|\psi_0\rangle$. However, if we obtain 
the outcome 1 once or more than once, we decide in favor 
of $|\psi_1\rangle$. 

The exact probability of error is straightforward to 
compute as follows. Let us assume again $q_0 > q_1$. If the 
unknown state were $|\psi_0\rangle$, we would make no error. If the 
unknown state were $|\psi_1\rangle$ (it happens with probability $q_1$), 
we would give the wrong answer only if $N_0 = N$, which 
happens with probability $c^{2N}$. Hence, the probability of 
error would be $q_1 c^{2N}$. If $q_1 > q_0$, we just exchange the 
subscripts 0 and 1 everywhere. The error probability is 
then 

$$ P^{fxd}_N = \min(q_0, q_1)\, c^{2N}. $$    (9)

We note that asymptotically $P^{fxd}_N$ may be larger than 
$P^{col}_N$ only because of the prefactor $\min(q_0, q_1) > q_0 q_1$, 
which is not important in most situations. This result 
has application in the assessment of the security of some 
quantum cryptographic protocols [12]. 
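
The different scalings discussed above (c^N for majority vote versus c^{2N} for unanimity vote and for the collective measurement) can be compared numerically. The Python sketch below is only an illustration under stated assumptions: equal priors and odd N for the majority vote (so that each single-copy outcome is wrong with the same probability and no ties occur), an overlap c = 0.8, and function names chosen here for convenience.

```python
import math

def unanimity_error(q0, c, n):
    """Eq. (9): every copy is measured in the basis containing the likelier state."""
    return min(q0, 1.0 - q0) * c ** (2 * n)

def majority_error_equal_priors(c, n):
    """Majority vote with the one-copy optimal measurement (q0 = q1 = 1/2, n odd)."""
    eps = 0.5 * (1 - math.sqrt(1 - c * c))   # one-copy error, Eqs. (4)-(5)
    # The vote fails when more than half of the n outcomes are wrong.
    return sum(math.comb(n, k) * eps ** k * (1 - eps) ** (n - k)
               for k in range((n + 1) // 2, n + 1))

def collective_error(q0, c, n):
    """Eq. (7) in a cancellation-free, algebraically identical form."""
    e = 4 * q0 * (1.0 - q0) * c ** (2 * n)
    return e / (2 * (1 + math.sqrt(1 - e)))

c = 0.8   # illustrative overlap (assumption)
for n in (1, 5, 11, 21):
    print(n, collective_error(0.5, c, n), unanimity_error(0.5, c, n),
          majority_error_equal_priors(c, n))
```

For a single copy the unanimity vote is worse than the optimal one-copy measurement, but as N grows its error falls below that of the majority vote and stays within a constant factor of the collective bound.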

B. Adaptive measurements 

So far, we have shown that the performance of individual 
and collective strategies is essentially the same for 
large ensembles of identical states. We now show that 
if we are not restricted to perform the same individual 
measurement on each copy, and we use the information 
we are gathering to optimize these measurements step by 
step, the overall performance is exactly the same as for 
the optimal collective strategy, regardless of the number of 
copies of the unknown state. One could reach this conclusion 
by using known algebraic results to trade $|\Omega(a)\rangle$ 
for a set of product states similar to those in (11) below. 
We follow here a different approach since we would like 
to present a constructive procedure within the framework 
of probability. 

We consider the simplest scenario where we always 
perform von Neumann measurements on each individual 
copy. The final outcomes are binary sequences or strings 
of length N, e.g., 011···01. Let us denote them by $x$. 
The strategy is designed in such a way that the last 
outcome (leftmost binary digit in $x$) determines whether our 
guess is $|\psi_0\rangle$ or $|\psi_1\rangle$. We have 

$$ \bar P^{ad}_N = \sum_{x \in \mathcal{L}_{N-1}} \{ q_0\, p(\mathbf{0x}|0) + q_1\, p(\mathbf{1x}|1) \}, $$    (10)

where 'ad' stands for adaptive, $\mathcal{L}_r$ is the set of binary 
strings of length $r$, and $0x$, $1x$ are the strings obtained 
by appending 0, respectively 1, to the left of the string $x$. 



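Quantum mechanics tells us that the conditional probability 
of obtaining the set of outcomes $x \in \mathcal{L}_r$ if the 
initial state were $|\psi_b\rangle$ is $p(\mathbf{x}|b) = |\langle\Omega(x)|\psi_b^{\otimes r}\rangle|^2$, where 

$$ |\Omega(x)\rangle = |\omega(x_r)\rangle \otimes |\omega(x_{r-1})\rangle \otimes \cdots \otimes |\omega(x_1)\rangle, $$    (11)

$x_k$ is the substring of length $k$ ($0 < k \le r$) consisting of 
the $k$ rightmost digits of $x$, and 

$$ |\omega(ax)\rangle = \cos\left(\phi_x - a\frac{\pi}{2}\right)|x\rangle + \sin\left(\phi_x - a\frac{\pi}{2}\right)|y\rangle, $$    (12)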

in analogy with (2). Note that $\phi_x$, the angle that defines 
the measurement $r + 1$, depends only on the list of 
outcomes, $x$, of the previous $r$ individual measurements. 
One readily sees that $\sum_{x \in \mathcal{L}_r} |\Omega(x)\rangle\langle\Omega(x)| = 1$ in $\mathcal{H}^{\otimes r}$, 
which implies that 

$$ \sum_x p(\mathbf{x}|b) = 1, $$    (13)

as it should be. We start with $r = 0$ ($\mathcal{L}_0$ contains only the 
empty string $\emptyset$) and set $\phi_\emptyset = \phi_0$, as defined in Eq. (3), 
which gives the optimal measurement for one copy. For 
$r > 0$, $\phi_x$ will be determined by requiring optimality step 
by step. We now can write 

$$ \bar P^{ad}_N = \sum_{a=0}^{1} \sum_{x \in \mathcal{L}_{N-1}} p(x, a)\, |\langle\omega(ax)|\psi_a\rangle|^2, $$    (14)

where $p(x, a)$ is the joint probability of $|\psi_a\rangle$ being 
transmitted and of our obtaining the (partial) outcome list $x$. 
Namely, $p(x, a) = q_a\, p(\mathbf{x}|a) = q_a \prod_{s=1}^{r} |\langle\omega(x_s)|\psi_a\rangle|^2$ 
(assuming $x \in \mathcal{L}_r$). Eq. (14) can be written in terms of the 
angles $\theta$ and $\phi_x$ using Eqs. (1) and (12). Maximizing 
over $\phi_x$, we obtain 

$$ \cos 2\phi_x = \frac{p(x,0) - p(x,1)}{R(x)}\, c, $$    (15)

where 

$$ R(x) = \sqrt{[p(x,0) + p(x,1)]^2 - 4\, p(x,0)\, p(x,1)\, c^2}, $$    (16)

and we also have 

$$ \sin 2\phi_x = \frac{p(x,0) + p(x,1)}{R(x)} \sin 2\theta. $$    (17)

Substituting back in (14) we obtain 

$$ \bar P^{ad}_N = \frac{1}{2} + \frac{1}{2} \sum_{x \in \mathcal{L}_{N-1}} R(x), $$    (18)

where we have used that $\sum_x p(x, a) = q_a$, which follows 
from (13). Eqs. (15)-(18) are analogous to 
Eqs. (3)-(5). Actually, the latter can be seen as a 
particular case of the former if we define $p(\emptyset, a) = q_a$ 
(this definition is sensible, since the empty binary string 
means that no measurement has yet been performed). 
Having set up this framework, one can prove our main 
result, namely, that this adaptive strategy gives exactly 
the same error probability as the optimal collective one 
for any N. A straightforward calculation yields 

$$ p(ax, b) = \frac{p(x,b)}{2} \left( 1 + (-1)^{a+b}\, \frac{p(x,b) + (1 - 2c^2)\, p(x, b \oplus 1)}{R(x)} \right), $$    (19)

where $\oplus$ stands for sum mod 2, and one can prove by 
induction the relation 

$$ q_0 q_1 c^{2r} [p(x,0) + p(x,1)]^2 - p(x,0)\, p(x,1) = 0, $$    (20)

for $x \in \mathcal{L}_r$, which is obviously satisfied for $r = 0$. 

Using this relation in (18) and recalling again 
that $\sum_x p(x, a) = q_a$, we finally have the result $P^{ad}_N = P^{col}_N$. 
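
The claim can be checked by brute force for small N. The Python sketch below enumerates all outcome histories, fixes each angle from Eqs. (15) and (17), propagates the joint probabilities p(x, a), and evaluates Eq. (18). It is a numerical illustration under assumptions, not the authors' procedure: the function names and the sample values q0 = 0.7, theta = 0.3 are chosen here, and the cost grows as 2^N.

```python
import math

def joint_probabilities(q0, theta, r):
    """Pairs (p(x,0), p(x,1)) for all 2**r histories x of r adaptive measurements."""
    c = math.cos(2 * theta)
    level = [(q0, 1.0 - q0)]                     # empty history: p(empty, a) = q_a
    for _ in range(r):
        nxt = []
        for p0, p1 in level:
            # Eqs. (15) and (17) fix 2*phi_x up to the positive factor 1/R(x).
            phi = 0.5 * math.atan2((p0 + p1) * math.sin(2 * theta), (p0 - p1) * c)
            for a in (0, 1):
                # p(ax, b) = p(x, b) * cos^2(phi_x - a*pi/2 - (-1)^b * theta)
                nxt.append((p0 * math.cos(phi - a * math.pi / 2 - theta) ** 2,
                            p1 * math.cos(phi - a * math.pi / 2 + theta) ** 2))
        level = nxt
    return level

def adaptive_error(q0, theta, n):
    """One minus Eq. (18): error probability after n adaptively measured copies."""
    c = math.cos(2 * theta)
    total_r = sum(math.sqrt((p0 + p1) ** 2 - 4 * p0 * p1 * c * c)   # R(x), Eq. (16)
                  for p0, p1 in joint_probabilities(q0, theta, n - 1))
    return 0.5 - 0.5 * total_r

def collective_error(q0, theta, n):
    """Eq. (7) with c = cos(2*theta)."""
    c = math.cos(2 * theta)
    return 0.5 * (1 - math.sqrt(1 - 4 * q0 * (1.0 - q0) * c ** (2 * n)))

# Illustrative values (assumptions): q0 = 0.7, theta = 0.3 rad.
for n in range(1, 7):
    print(n, adaptive_error(0.7, 0.3, n), collective_error(0.7, 0.3, n))
```

The two printed columns should agree to rounding error, and the same enumeration can be used to confirm relation (20) history by history.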

It is not difficult to show that 

$$ \cos 2\phi_x = (-1)^{i_r}\, c\, \sqrt{\frac{1 - 4 q_0 q_1 c^{2r}}{1 - 4 q_0 q_1 c^{2r+2}}}, $$    (21)

where $i_r$ is the leftmost digit in $x \in \mathcal{L}_r$ and we have used 
that $\mathrm{sign}[p(x,0) - p(x,1)] = (-1)^{i_r}$ [note that 
$\mathrm{sign}(q_0 - q_1) = (-1)^{i_0}$]. 

We immediately realize that the actual dependence 
of the individual measurement $r + 1$ on previous outcomes 
is extremely simple: it is just a function of the 
$r$-th outcome, i.e., of $i_r$, rather than a function of the 
whole binary sequence $x$. In this sense, the optimal 
one-step adaptive scheme is 'Markovian'. It is thus 
convenient to change the notation and define $\phi_r \equiv \phi_x$, 
$|\omega_{r+1}(a)\rangle \equiv |\omega(ax)\rangle$, for $x \in \mathcal{L}_r$. Eq. (12) becomes 

$$ |\omega_{r+1}(a)\rangle = \cos\left(\phi_r - a\frac{\pi}{2}\right)|x\rangle + \sin\left(\phi_r - a\frac{\pi}{2}\right)|y\rangle, $$    (22)

where subscript $r + 1$ refers to the measurement on copy 
$r + 1$ and $a = 0, 1$ is the corresponding outcome. Eq. (2) 
is a particular case of this equation. 
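
Because the angle in Eq. (21) depends only on r and on the last outcome, the error probability can be propagated in a number of steps linear in N instead of enumerating 2^N histories. The Python sketch below does this by tracking, for each last outcome i and each state b, the total joint probability. It is an illustrative sketch under assumptions: the function names and sample values are chosen here, and i_0 is fixed by sign(q0 - q1) = (-1)^{i_0} (set to 0 for equal priors, where the sign is irrelevant).

```python
import math

def markov_adaptive_error(q0, theta, n):
    """Error probability of the Markovian scheme built from Eqs. (21) and (22)."""
    q1 = 1.0 - q0
    c = math.cos(2 * theta)
    i0 = 0 if q0 >= q1 else 1                 # sign(q0 - q1) = (-1)^{i0}
    # weight[i][b]: joint probability of "last outcome i" and "state b".
    weight = [[0.0, 0.0], [0.0, 0.0]]
    weight[i0] = [q0, q1]
    for r in range(n):
        ratio = (1 - 4 * q0 * q1 * c ** (2 * r)) / (1 - 4 * q0 * q1 * c ** (2 * r + 2))
        magnitude = c * math.sqrt(ratio)      # |cos(2 phi_r)| from Eq. (21)
        new = [[0.0, 0.0], [0.0, 0.0]]
        for i in (0, 1):
            # 2*phi in [0, pi], so sin(2 phi) >= 0, consistent with Eq. (17).
            phi = 0.5 * math.acos((-1) ** i * magnitude)
            for a in (0, 1):
                for b in (0, 1):
                    amp = math.cos(phi - a * math.pi / 2 - (-1) ** b * theta)
                    new[a][b] += weight[i][b] * amp ** 2
        weight = new
    # The guess is the last outcome, so errors are (outcome 0, state 1) and vice versa.
    return weight[0][1] + weight[1][0]

def collective_error(q0, theta, n):
    """Eq. (7) in a cancellation-free, algebraically identical form."""
    e = 4 * q0 * (1.0 - q0) * math.cos(2 * theta) ** (2 * n)
    return e / (2 * (1 + math.sqrt(1 - e)))

# Illustrative values (assumptions): q0 = 0.7, theta = 0.3 rad.
for n in (1, 2, 5, 10, 40):
    print(n, markov_adaptive_error(0.7, 0.3, n), collective_error(0.7, 0.3, n))
```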



C. Bayesian updating interpretation 

Finally, we would like to show that the adaptive 
strategy we have presented has a natural interpretation as 
Bayesian updating (we refer to [10] for an alternative 
point of view). This, along with the results of the previous 
section, proves that Bayesian updating is the unique 
solution to the recursion relations (19) that define the 
best adaptive strategy. 

Note that our knowledge of the system, which changes 
after each measurement, is encoded in the a posteriori 
probabilities of $|\psi_a\rangle$ being the unknown state given that a 
specific outcome has occurred when performing the 
measurement on, say, the $r$-th copy. We will show below that 
these a posteriori probabilities can be identified with $\bar P^{ad}_r$ 
and $P^{ad}_r$. Assuming this for the time being, we might be 
tempted to take a Bayesian point of view and use them 
to update our prior probabilities for the next measurement. 
Hereafter, we drop the superscript 'ad' to further 
simplify the notation. 

Suppose we have got the first copy of the unknown 
state. Our optimal measurement will be defined by $\phi_0$ in 
Eq. (3). If we obtain the outcome $i_1 = 0$, we will update 
our priors using the rule $q_0 \to p(0|\mathbf{0}) = \bar P_1$, and we will 
use again (3) to optimize the measurement on the second 
copy (similarly, if the first outcome is $i_1 = 1$, we will view 
$p(1|\mathbf{1}) = \bar P_1$ as our prior $q_1$ for the second measurement). 
Hence, the second measurement is defined by $\cos 2\phi_1 = 
(-1)^{i_1} c\, |\bar P_1 - P_1| (1 - 4 \bar P_1 P_1 c^2)^{-1/2}$, and we obtain that 
the discrimination (error) probability after the second 
measurement is $\bar P_2 = [1 + (1 - 4 \bar P_1 P_1 c^2)^{1/2}]/2$ 
($P_2 = [1 - (1 - 4 \bar P_1 P_1 c^2)^{1/2}]/2$). This updating of the prior 
probabilities can be carried out step by step until we run out of copies. 
At step $r$ we will have 

$$ \cos 2\phi_r = (-1)^{i_r}\, c\, \frac{|\bar P_r - P_r|}{R_r}, $$    (23)

where by analogy with $R(x)$, we have defined $R_r = (1 - 
4 \bar P_r P_r c^2)^{1/2}$, and we obtain 

$$ \bar P_{r+1} = (1 + R_r)/2. $$    (24)

This leads to the recursion relation 

$$ R_{r+1} = \sqrt{1 - (1 - R_r^2)\, c^2}, $$    (25)

whose solution can readily be seen to be $R_r = [1 - 
4 q_0 q_1 c^{2r+2}]^{1/2}$, and we again find that $P^{ad}_N = P^{col}_N$. 
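
The updating rule is short enough to be written out directly. In the Python sketch below (function names and sample values are illustrative assumptions), the pair of current priors is replaced after each copy by the pair of posteriors, Eq. (24), and the resulting factors are compared with the closed-form solution of Eq. (25). Only the values of the two priors matter here; which hypothesis carries the larger one depends on the last outcome.

```python
import math

def bayesian_updating(q0, c, n):
    """Return (discrimination, error) probabilities after n copies."""
    hi, lo = max(q0, 1.0 - q0), min(q0, 1.0 - q0)   # current pair of priors
    for _ in range(n):
        r = math.sqrt(1 - 4 * hi * lo * c * c)       # R_r for the current priors
        hi, lo = 0.5 * (1 + r), 0.5 * (1 - r)        # Eq. (24): posteriors become priors
    return hi, lo

def closed_form_R(q0, c, r):
    """Solution of the recursion (25): R_r = [1 - 4 q0 q1 c^(2r+2)]^(1/2)."""
    return math.sqrt(1 - 4 * q0 * (1.0 - q0) * c ** (2 * r + 2))

# Illustrative values (assumptions): q0 = 0.7, overlap c = 0.8.
for n in (1, 2, 5, 10):
    success, error = bayesian_updating(0.7, 0.8, n)
    print(n, error, 0.5 * (1 - closed_form_R(0.7, 0.8, n - 1)))
```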

We still need to show that the a posteriori probabilities 
indeed coincide with $\bar P_r$. It suffices to prove it for the case 
$r = 1$, where this statement amounts to $\bar P_1 = p(0|\mathbf{0}) = 
p(1|\mathbf{1})$. This result follows from the obvious formula 

$$ \bar P_1 = p(0|\mathbf{0})\, p(\mathbf{0}) + p(1|\mathbf{1})\, p(\mathbf{1}), $$    (26)

where $p(\mathbf{b})$ is the probability of obtaining the outcome $b$, 
if the 'detailed balance' relation 

$$ p(0|\mathbf{0}) = p(1|\mathbf{1}) $$    (27)

holds for the optimal scheme. Let us prove this is the 
case. 

Using Bayes' formula we can cast (27) as 

$$ \frac{|\langle\omega_1(0)|\psi_0\rangle|^2 q_0}{p(\mathbf{0})} = \frac{|\langle\omega_1(1)|\psi_1\rangle|^2 q_1}{p(\mathbf{1})}. $$    (28)

We further note that the probabilities of obtaining 
the outcome $a$ can simply be written as $p(\mathbf{a}) = 
\sum_b |\langle\omega_1(a)|\psi_b\rangle|^2 q_b$. Therefore, Eqs. (27) and (28) are 
equivalent to 

$$ \frac{|\langle\omega_1(0)|\psi_1\rangle|^2 q_1}{|\langle\omega_1(0)|\psi_0\rangle|^2 q_0} = \frac{|\langle\omega_1(1)|\psi_0\rangle|^2 q_0}{|\langle\omega_1(1)|\psi_1\rangle|^2 q_1}. $$    (29)

This, in turn, is equivalent to 

$$ (q_0 - q_1) \sin 2\phi \cos 2\theta = (q_0 + q_1) \cos 2\phi \sin 2\theta, $$    (30)

which obviously holds for the optimal strategy [see 
Eq. (3)], and concludes the proof. 
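
The detailed balance relation (27) is easy to confirm numerically. The Python sketch below (names and sample values are illustrative assumptions) builds the likelihoods p(outcome a | state b) at the optimal angle of Eq. (3), applies Bayes' rule, and returns the two a posteriori probabilities, which should coincide with each other and with (1 + R_0)/2.

```python
import math

def posteriors_at_optimum(q0, theta):
    """Return [p(state 0 | outcome 0), p(state 1 | outcome 1)] at the optimal angle."""
    q1 = 1.0 - q0
    c = math.cos(2 * theta)
    phi = 0.5 * math.atan2(math.sin(2 * theta), (q0 - q1) * c)   # Eq. (3)
    # like[a][b] = p(outcome a | state b) = cos^2(phi - a*pi/2 - (-1)^b * theta)
    like = [[math.cos(phi - a * math.pi / 2 - (-1) ** b * theta) ** 2
             for b in (0, 1)] for a in (0, 1)]
    prior = (q0, q1)
    # Probability of each outcome, summed over the two possible states.
    p_outcome = [sum(like[a][b] * prior[b] for b in (0, 1)) for a in (0, 1)]
    # Bayes' rule for the state that matches the outcome.
    return [like[a][a] * prior[a] / p_outcome[a] for a in (0, 1)]

# Illustrative values (assumptions): q0 = 0.7, theta = 0.3 rad.
post = posteriors_at_optimum(0.7, 0.3)
r0 = math.sqrt(1 - 4 * 0.7 * 0.3 * math.cos(0.6) ** 2)
print(post, 0.5 * (1 + r0))
```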

IV. CONCLUDING REMARKS 

In summary, multiple-copy two-state discrimination 
strategies based on individual measurements can be as 
good as the best collective ones. For fixed measurements, 
this statement holds only asymptotically. By relaxing 
this constraint and allowing Bayesian updating, which is 
arguably the simplest and easiest-to-implement adaptive 
strategy, the statement holds for any finite number of 
copies. Furthermore, our approach provides very simple 
recursion relations [e.g., (23), (24), and (25)] or even 
closed-form expressions [e.g., (21); recall the change of 
notation $\phi_r \equiv \phi_x$] for the angles $\phi_r$ defining the 
optimal von Neumann measurements and the 
discrimination/error probabilities. 

Finally, we would like to point out that the general 
adaptive setup of Sec. III B, where measurements are 
allowed to depend on histories or lists of outcomes (rather 
than just the very last outcome) has a unique solution 
which can be regarded as Bayesian updating. Despite all 
this generality, the optimal solution is as simple as can 
be. 